Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit:

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial only changes have been incorporated in the document. 
1 Scope



The present document contains an electronic copy of the ANSI-C code for DSR Extended Advanced Front-end. The 
ANSI-C code is necessary for a bit exact implementation of DSR Extended Advanced Front-end. 



2 References



The following documents contain provisions which, through reference in this text, constitute provisions of the present 
document. 

[1] ETSI ES 202 050: "Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithm", Oct 2002.

[2] ETSI ES 202 212: "Distributed Speech Recognition; Extended Advanced Front-end Feature Extraction Algorithm; Compression Algorithm, Back-end Speech Reconstruction Algorithm", Nov 2003.

[3] 3GPP TS 26.177: "Speech Enabled Services (SES); Distributed Speech Recognition (DSR) extended advanced front-end test sequences".



3 Definitions and abbreviations 

3.1 Definitions 

Definitions of terms used in the present document can be found in [1] and [2].

3.2 Abbreviations 

For the purpose of the present document, the following abbreviations apply: 

ANSI American National Standards Institute 

I/O Input/Output 

RAM Random Access Memory 

ROM Read Only Memory 

AFE Advanced Front-end 

X-AFE Extended Advanced Front-end

DSR Distributed Speech Recognition 



4 C code structure

This clause gives an overview of the contents and organization of the bit-exact C code attached to this document.

The C code has been verified on the following systems:

- Sun Microsystems workstations with the GNU gcc compiler;

- IBM PC compatible computers with the Linux operating system and the GNU gcc compiler.

ANSI-C was selected as the programming language because portability was desirable. 

4.1 Contents of the C source code 

The distributed files with suffix "c" contain the source code and the files with suffix "h" are the header files. 
Makefiles are provided for the platforms on which the C code has been verified (listed above).



4.2 Program execution 



There are separate executables for the FrontEnd and Vector Quantization, with and without Extensions. The command 
line options are described below. 

<> - indicates the allowed parameters for the given option when running the executable
() - indicates the default parameter.

FrontEnd w/ Extension: 

USAGE: bin/ExtAdvFrontEnd infile HTK_outfile pitch_outfile class_outfile [options] 

OPTIONS: 

-q                     Quiet mode (FALSE)
-F format              Input file format <NIST,HTK,RAW> (NIST)
-fs freq               Sampling frequency in kHz <8,16> (8)
-swap                  Change input byte ordering (Native)
-noh                   No HTK header to output file (FALSE)
-noc0                  No c0 coefficient to output feature vector (FALSE)
-nologE                No logE component to output feature vector (FALSE)
-skip_header_bytes n   Skip header, first n bytes (only for -F RAW)

The options -noh, -noc0, -nologE and -skip_header_bytes are not used and should be left at their defaults.
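
The effect of a byte-order change such as the one requested by -swap can be sketched as follows. This is only an illustration, assuming that the input consists of 16-bit samples; the helper name swap_samples16 is hypothetical and is not taken from the distributed code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the distributed code):
 * reverse the byte order of each 16-bit sample in place. */
static void swap_samples16(int16_t *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        uint16_t u = (uint16_t)buf[i];
        /* Exchange the high and the low byte. */
        u = (uint16_t)((u << 8) | (u >> 8));
        buf[i] = (int16_t)u;
    }
}
```

Applying the helper twice restores the original buffer, which is a convenient sanity check when the byte order of a recording is unknown.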

FrontEnd w/o Extension: 

USAGE: bin/AdvFrontEnd infile HTK_outfile [options]

OPTIONS: Same as FrontEnd w/ Extension

Vector Quantization w/ Extension: 

Usage: extcoder htk_file_in pitch_file_in class_file_in bitstream_file_out pitch_file_out txt_file_out -freq x -VAD/No_VAD

htk_file_in          Input mel-frequency cepstral coefficient file in HTK MFCC format.
pitch_file_in        Input pitch period file.
class_file_in        Input classification file.
bitstream_file_out   Output binary bitstream.
pitch_file_out       Output quantised pitch period file.
txt_file_out         Vector quantiser output in text format.
-freq x              Sampling frequency in kHz (8 or 16).
-VAD                 Use voice activity detector data. The voice activity input file must have the same name as htk_file_in, but with extension .vad.
-No_VAD              Do not incorporate voice activity detector information in the output bitstream.

Vector Quantization w/o Extension: 

Usage: coder htk_file_in bitstream_file_out txt_file_out -freq x -VAD/No_VAD

htk_file_in          Input mel-frequency cepstral coefficient file in HTK MFCC format.
bitstream_file_out   Binary output bitstream.
txt_file_out         Vector quantiser output in text format.
-freq x              Sampling frequency in kHz (8 or 16).
-VAD                 Use voice activity detector data. The voice activity input file must have the same name as htk_file_in, but with extension .vad.
-No_VAD              Do not incorporate voice activity detector information in the output bitstream.

File extension descriptions as generated by the sample script: 

.cep - Binary file containing cepstral features in HTK format. Output from the FrontEnd, input to the vector quantizer.
.pitch - Binary file containing pitch information. Output from the FrontEnd, input to the vector quantizer. Only used for Extension.
.class - ASCII file containing class information. Output from the FrontEnd, input to the vector quantizer. Only used for Extension.
.bs - Binary file containing the bitstream. Output from the vector quantizer.
.log - Log files from the different executables.
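
Since the .cep files are stated to be in HTK format, their header can be inspected with a few lines of C. The sketch below follows the 12-byte parameter file header documented in the HTK Book and assumes HTK's default big-endian byte order; the type and function names (HtkHeader, parse_htk_header) are hypothetical and do not come from the distributed code.

```c
#include <stdint.h>

/* Fields of the 12-byte header that precedes the data in an HTK
 * parameter file. Names are illustrative only. */
typedef struct {
    uint32_t n_samples;   /* number of feature vectors in the file */
    uint32_t samp_period; /* frame period in units of 100 ns */
    uint16_t samp_size;   /* number of bytes per feature vector */
    uint16_t parm_kind;   /* parameter kind code with qualifier bits */
} HtkHeader;

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Decode the first 12 bytes of an HTK parameter file. */
static HtkHeader parse_htk_header(const unsigned char b[12])
{
    HtkHeader h;
    h.n_samples = be32(b);
    h.samp_period = be32(b + 4);
    h.samp_size = be16(b + 8);
    h.parm_kind = be16(b + 10);
    return h;
}
```

Decoding from a byte buffer rather than reading a struct directly keeps the sketch independent of the host byte order and of structure padding.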
4.3 Code hierarchy

Tables 1 to 3 are call graphs that show the functions used for AFE (table 1), VQ (table 2), and Extension (table 3). 

Each column represents a call level and each cell a function. The functions contain calls to the functions in rightwards 
neighboring cells. The time order in the call graphs is from the top downwards as the processing of a frame advances. 
All standard C functions (printf(), fwrite(), etc.) have been omitted. Also, no basic operations (add(), L_add(), mac(),
etc.) or double precision extended operations (e.g. L_Extract()) appear in the graphs.

The basic operations are not counted as extending the depth; therefore the deepest level in this software is level 7.
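
For orientation, a double precision extended operation of the kind omitted from the call graphs can be sketched as follows. The sketch follows the convention of the ITU-T/ETSI fixed-point libraries, in which a 32-bit value is split into a 16-bit high part and a 15-bit low part; whether the attached code defines L_Extract() in exactly this way is an assumption. The sketch_ names are hypothetical, and arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

typedef int16_t Word16;
typedef int32_t Word32;

/* Split L into hi (upper 16 bits) and lo (the following 15 bits), so that
 * L == hi * 65536 + lo * 2 whenever bit 0 of L is zero.
 * Assumes that >> on a negative value is an arithmetic shift. */
static void sketch_L_Extract(Word32 L, Word16 *hi, Word16 *lo)
{
    *hi = (Word16)(L >> 16);
    *lo = (Word16)((L >> 1) - (Word32)(*hi) * 32768);
}

/* Inverse operation: rebuild the 32-bit value from its two parts. */
static Word32 sketch_L_Comp(Word16 hi, Word16 lo)
{
    return (Word32)hi * 65536 + (Word32)lo * 2;
}
```

Keeping the low part in 15 bits means that both halves are non-overflowing 16-bit operands, so 32x16 and 32x32 multiplications can be built from 16x16 multiplications.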
Table 1: AFE call structure

main()
AdvProcessInit_B()
DoNoiseSupInit_B()
DoWaveProcInit_B()
DoCompCepsInit_B()
DoPostProcInit_B()
DoVADInit_F()
Do16kProcInit_B()
QMF_FIR_Init_B()
AdvProcessAlloc_B()
FlushAdvProcess_B()
AdvProcessDelete_B()
DoAdvProcess_B()
fir_initialization_B()
DP_HP_filters_B()
BufIn32Alloc()
DoNoiseSupAlloc_B()
DoWaveProcAlloc_B()
DoCompCepsAlloc_B()
DoPostProcAlloc_B()
DoVADAlloc_F()
Do16kProcAlloc_B()
DoVADFlush_F()
CvFeatInt2Float()
DoNoiseSupDelete_B()
DoWaveProcDelete_B()
DoCompCepsDelete_B()
DoPostProcDelete_B()
DoVADDelete_B()
BufIn32Free()
Do16kProcessing_B()
DoNoiseSup_B()
Get16k_p_bufferData16k_B()
Get16k_bufData16kSize_B()
Get16k_p_BandsForCoding16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_dataHP_B()
VAD_F()
Log_2()
DoSigWindowing16_F1()
DoSigWindowing16_F2()
ff4NRFix32_B()
    GetL15()
    GetH15()
    Mult16x32()
    Add_Mult16x16_16()
    Sub_Mult16x16_16()
    Permut()
FFTtoPSD_F()
    Square24d2_B()
    Square24_B()
Get16k_BFC_dec_B()
GetBandsForCoding16k_B()
PSDMean_F()
NoiseEstimation_F1()
    Sqrt_2()
    Sqrt16_2()
NoiseEstimation_F2()
    Sqrt_2()
    Sqrt16_2()
FilterCalc_F()
SpeechQVar()
FilterBank16()
SpeechQSpec()
SpeechQMel()
DoGainFact_F1()
    Log_2()
DoGainFact_F2()
    Log_2()
DoMelIDCT_F16()
ApplyWF()
Get16k_dec1()
Get16k_dec2()
Get16k_dec3()
DoSigWindowing16_F3()
ff4NRFix32_B()
    GetL15()
    GetH15()
    Mult16x32()
    Add_Mult16x16_16()
    Sub_Mult16x16_16()
    Permut()
FFTtoPSD_F()
    Square24d2_B()
    Square24_B()
DoMelFB_B()
CodeBands16k_B()
DoSpecSub16k_B()
    Log_2()
UpDateDecal()
ApplyDecal()
DCOffsetFil_F()
Get16k_hpBandsSize_B()
DoWaveProc_B()
DoCompCeps_B()
Get16k_p_hpBands_B()
Get16k_p_bufferCodeForBands16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_p_bufferCodeWeights_B()
Get16k_p_codeWeights_B()
Set16k_hpBands_dec_B()
TeagerEng()
GetTeagerFilter()
CepsCompute()
DoPostProc_B()
DoVADProc_F()
focalpoint()
GetMaximaPositions()
Get16k_p_bufferCodeWeights_B()
Get16k_p_bufferCodeForBands16k_B()
PreEmphHamm()
tf4NB16_B()
GetBandsForDecoding16k_B()
DecodeBands16k_B()
FilterBank()
Get16k_hpBands_dec_B()
Get16k_p_hpBands_B()
MergeSSandCoded_B()
CorrectEnergy_B()
CosInv16Khz()
cosInv() (only for 8kHz)



Table 2: VQ call structure

main()
quantize_and_print()
get_best_dataframe()
quant_pitch_abs()
get_class_bit()
quant_pitch_diff()
get_class_bit()
mfcc_crc_encode()
pc_crc_encode()
best_centroid()
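
Table 2 lists best_centroid() among the VQ functions, but its signature and distance measure are not given in this document. The following sketch only illustrates the general nearest-centroid search that a vector quantiser performs. It assumes two-dimensional codebook entries and an unweighted squared Euclidean distance; the function name nearest_centroid is hypothetical and the attached code may differ, for example by applying weights.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: return the index of the 2-D centroid in cb[0..n-1]
 * that is closest to (x0, x1) in unweighted squared Euclidean distance. */
static size_t nearest_centroid(const int16_t (*cb)[2], size_t n,
                               int16_t x0, int16_t x1)
{
    size_t best = 0, i;
    int64_t best_d = INT64_MAX;

    for (i = 0; i < n; i++) {
        int64_t d0 = (int64_t)x0 - cb[i][0];
        int64_t d1 = (int64_t)x1 - cb[i][1];
        int64_t d = d0 * d0 + d1 * d1;
        if (d < best_d) { /* keep the first entry on ties */
            best_d = d;
            best = i;
        }
    }
    return best;
}
```

The returned index, rather than the centroid itself, is what a quantiser would write to the bitstream.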
Table 3: Extension call structure

main()
RVC_ConstructPitchRom_be()
RVC_ConstructPitchMeter_be()
RVC_DestructPitchRom_be()
RVC_DestructPitchMeter_be()
DoAdvProcess_B()
Allocate_InterpolatedDft_be()
RVC_ResetPitchMeter_be()
Deallocate_InterpolatedDft_be()
DoPitchExtract()
FilterBank()
dsr_afe_vad()
get_vm()
IsLowBandNoise()
get_zcm()
pre_process()
RVC_MeasurePitch_be()
fnLog2()
iir_d()
iir_s()
ClearPitch_be()
DirichletInterpolation_be()
IsLowLevelInput_be()
Finalize_be()
IsContinuousPitch_be()
Mpy_lw_sw()
PrepareSpectralPeaks_be()
Mpy_lw_sw()
CalcSpectrum_be()
FindPeaks_be()
PrelimScaleDownAmpsOfHighFreqPeaks_be()
qsort_be()*
ComparelpointAmp_be()
RefineSpectralPeaks_be()
FindPitchCandidates_be()
FinalScaleDownAmpsOfHighFreqPeaks_be()
Mpy_lw_sw()
Mpy_lw_sw()
Mpy_lw_sw_Add()
swap()
sqrt_l_fix()
NormalizeAmplitudes_be()
CalcUtilityFunction_be()
CreatePieceWiseConstantFunction_be()
qsort_be()*
Compare_ARRAY_OF_XPOINTS_be()
LinkArrayOfPoints_be()
AddSortedArrayOfPoints_be()
FindDominantLocalMaximaInUtilityFunction_be()
UtilityFunctionAtGivenPitchFreq_be()
qsort_be()*
ComparePitchFreqAscending_be()
SelectTopPitchCandidates_be()
compute_pcorr_be()
ConvertLinkedListOfDiffPointsToUtilFunc_be()
L_Extract()
Mpy_32_16()
swap()
LinkArrayOfPoints_be()
Mpy_lw_sw()
swap()
Mpy_lw_sw()

3GPP TS 26.243 version 11.0.0 Release 11        ETSI TS 126 243 V11.0.0 (2012-10)



SelectFinalPitch_be()
interpolate_be()
Mpy_lw_sw()
Mpy_lw_lw()
sqrt_l_fix()
find_most_energetic_window_be()
accumulate_be()
find_most_energetic_window2_be()
Mpy_lw_sw()
qsort_be()*
ComparePitchFreqDescending_be()
ClearPitch_be()
GOOD_ENOUGH_be()
CLOSELY_LOCATED_be()
BETTER_be()
IsContinuousPitch_be()
classify_frame()
CalculateDoubleWindowDft_be()
swap()
Mpy_lw_sw()
Mpy_lw_sw()

* qsort_be() is a recursive function
4.5 Variables, constants and tables

The data types of variables and tables used in the fixed point implementation are signed integers in 2's complement
representation, defined by:

- Word16 16 bit variable;

- Word32 32 bit variable.
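
As an illustration of how these two types are typically used, the sketch below maps them onto the C99 fixed-width integer types and shows saturating additions of the kind that basic operations such as add() and L_add() conventionally provide. The use of <stdint.h> (which requires a C99 compiler) and the helper names sat_add16 and sat_add32 are assumptions made for this example; the attached code supplies its own type definitions and basic operations.

```c
#include <stdint.h>

typedef int16_t Word16; /* 16 bit signed, 2's complement */
typedef int32_t Word32; /* 32 bit signed, 2's complement */

/* Hypothetical helper: 16-bit addition that clips to the
 * Word16 range instead of wrapping around on overflow. */
static Word16 sat_add16(Word16 a, Word16 b)
{
    Word32 s = (Word32)a + (Word32)b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (Word16)s;
}

/* Hypothetical helper: 32-bit addition with saturation. */
static Word32 sat_add32(Word32 a, Word32 b)
{
    int64_t s = (int64_t)a + (int64_t)b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (Word32)s;
}
```

Saturation rather than wrap-around is what makes results reproducible across platforms, which is a precondition for a bit-exact implementation.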
4.5.1 Description of constants used in the C-code

Table 5a: Global constants for AFE

Constant                          Value        Description
NS_SPEC_ORDER_16K                 64           Noise suppression array length
NS_HANGOVER_16K                   15           Noise suppression hangover count
NS_MIN_SPEECH_FRAME_HANGOVER_16K  4            Noise suppression minimum speech frame hangover count
NS_ANALYSIS_WINDOW_16K            80           Noise suppression analysis window
PERC_CODED                        0.7          lambda merge (empirically set constant)
LAMBDA_NSE16k                     0.99         Noise estimation lambda
NS_NB_FRAME_THRESHOLD_NSE         100          Noise suppression number of frame threshold used for NSE
LENGTH_QMF                        118          QMF filter length
f24                               1            multiplier for QMF filter coefficients
SHFT_H                            8            shift to get higher value
L_H                               16           shift to get lower value
HP16k_MEL_USED                    3            Higher frequency band Mel used
NB_LP_BANDS_CODING                3            Lower frequency band used in coding
NE16k_FRAMES_THRESH               100          Noise estimation frames threshold
NB_TOPOSTPROC                     12           Number of coefficients to postprocess
CEP_FRAME_LENGTH                  200          Frame length for cepstral coefficients
CEP_NB_COEF                       13           Number of cepstral coefficients (including c0)
CEP_NB_CHANNELS                   23           Number of filters used for cepstral coefficients
CEP_FFT_LENGTH                    256          FFT length for cepstral coefficients
FRAME_BUF_SIZE                    241          Denoised output buffer size
FRAME_SHIFT                       80           WaveProcessing input frame shift
FRAME_LENGTH                      200          WaveProcessing frame size
NS_SPEC_ORDER                     65           Noise suppression array length (8 kHz)
NS_BUFFER_SIZE                    180          Noise suppression past frame size
NS_FRAME_SHIFT                    80           Noise suppression input frame shift
NS_HALF_FILTER_LENGTH             8            Noise suppression filter half size
NS_NB_FRAME_THRESHOLD_LTE         10           Noise suppression long term energy forgetting factor threshold (in frames)
NS_NB_FRAME_THRESHOLD_NSE         100          Noise suppression spectrum estimate forgetting factor threshold (in frames)
NS_MIN_FRAME                      10           Number of frame threshold to update average energy for noise suppression VAD
NS_FFT_LENGTH                     256          FFT length for noise suppression
WF_MEL_ORDER                      25           Noise suppression Wiener filter order
SHFT_NOISE                        14           shift applied to noise spectrum estimate
SHFT_FACT_MUL                     14           shift applied to gain coefficient (noise suppression gain factorization)
IDCT_ORDER                        25           Noise suppression IDCT order
NS_BETA                           0.98         Noiseless signal suppression factor
NS_RSB_MIN                        0.079432823  Minimum a priori SNR
NS_LAMBDA_NSE                     0.99         Forgetting factor for noise spectrum estimate
NS_LOG_SPEC_FLOOR                 -10.0        average energy minimum threshold
NS_SNR_THRESHOLD_VAD              15           SNR threshold for noise suppression VAD
NS_SNR_THRESHOLD_UPD_LTE          20           Long term energy update threshold for noise suppression VAD
NS_ENERGY_FLOOR                   80           Energy minimum threshold for noise suppression VAD
MaxPos                            10           Maximum number of maxima in waveprocessing
WP_EPS                            0.2          weighting value added or subtracted for waveprocessing

Table 5b: Global constants for VQ

Constant              Value      Description
MIN_PERIOD            1245184    Minimum pitch period allowed
MAX_PERIOD            9175040    Maximum pitch period allowed
NUM_MULTI_LEVELS_1    26         number of levels in pitch quantization
NUM_MULTI_LEVELS_2    24         number of levels in pitch quantization
UNVOICED_CODE                    init value for Opindex
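
The two period limits in Table 5b are exact multiples of 65536 (1245184 = 19 x 65536 and 9175040 = 140 x 65536). This suggests, although the document does not state it, a fixed-point representation with 16 fractional bits. Under that assumption, and further assuming that periods are counted in samples at an 8 kHz sampling rate, the limits can be converted as sketched below. The helper names are hypothetical.

```c
#include <stdint.h>

#define MIN_PERIOD 1245184L /* value from Table 5b */
#define MAX_PERIOD 9175040L /* value from Table 5b */

/* Interpret v as a fixed-point number with 16 fractional bits.
 * That the constants use this format is an assumption. */
static double q16_to_double(int32_t v)
{
    return (double)v / 65536.0;
}

/* Convert a pitch period (assumed to be in samples, 16 fractional bits)
 * to a frequency in Hz for a sampling rate fs_hz. */
static double period_q16_to_hz(int32_t period_q16, double fs_hz)
{
    return fs_hz / q16_to_double(period_q16);
}
```

With these assumptions, the shortest period corresponds to roughly 421 Hz and the longest to roughly 57 Hz at 8 kHz.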



Table 5c: Global constants for Extension

Constant                            Value        Description
HISTORY_LEN                         100          History length - past samples for pitch extraction
DOWN_SAMP_FACTOR                    4            Down-sampling factor - used in computing correlation
NO_OF_DFT_POINTS                    128          Number of DFT points
BREAK_POINT                         12           Break point - marks the end of low frequency band
LBN_HIST_WEIGHT                     32440        Low band noise history weight
LBN_CURR_WEIGHT                     328          Low band noise current weight (32768 - LBN_HIST_WEIGHT)
LBN_MAX_THR                         124518       Low band noise maximum threshold
LBN_LOW_ENR_LEVEL_MANT              32000        Low band noise low energy level mantissa
LBN_LOW_ENR_LEVEL_SHFT              22           Low band noise low energy level shift
RVC_OK                                           Return code for success
RVC_ERR                             -1           Return code for unspecified error
RVC_ERR_NOT_ENOUGH_MEMORY           -2           Return code for not enough memory
RVC_ERR_ILLEGAL_ARGUMENT            -3           Return code for an illegal input / output argument
RVC_ERR_IO_FAILED                   -4           Return code for failed input / output to a file
RVC_ERR_BAD_FILE_FORMAT             -5           Return code for a bad file header
RVC_ERR_NOT_INITIALIZED             -6           Return code for failure due to improper initialization
RVC_ERR_ILLEGAL_USAGE               -7           Return code for illegal usage of a function
RVC_ERR_NOT_ENOUGH_SAMPLES          -8           Return code for insufficient number of samples
RVC_ERR_NOT_IMPLEMENTED             -9           Return code for an unimplemented function
RVC_ERR_FAIL_OPEN_FILE              -10          Return code for failure to open a file
UB_ENRG_FRAC                        59           Upper band energy fraction
ZCM_THLD                            87           Zero crossing measure threshold
SQRT_ONE_HALF                       0x5A82       Square root of 0.5 (0.707)
FRAME_LEN_DS                        50           Frame length downsampled (200/4)
FRAME_LEN_DS_BY_2                   25           Frame length downsampled divided by 2
HISTORY_LEN_DS                      25           History length downsampled (100/4)
WINDOW_LENGTH                       18           Window length used in computing correlation
INV_WINDOW_LENGTH                   1820         Inverse of window length (1/18 = 0.05556)
NUM_CHAN                            23           Number of channels or Mel-frequency bands
MIN_CH_ENRG_MANTISSA                20000        Minimum channel energy mantissa
MIN_CH_ENRG_SHIFT                   25           Minimum channel energy shift
INIT_SIG_ENRG_MANTISSA              30518        Initial signal energy mantissa
INIT_SIG_ENRG_SHIFT                 8            Initial signal energy shift
CE_SM_FAC                           18022        Channel energy smoothing factor
CE_SM_FAC_COMPL                     14746        Channel energy smoothing factor complement
CNE_SM_FAC                          3277         Channel noise energy smoothing factor
CNE_SM_FAC_COMPL                    29491        Channel noise energy smoothing factor complement
LO_GAMMA                            22938        Low gamma value
LO_GAMMA_COMPL                      9830         Low gamma value complement
HI_GAMMA                            29491        High gamma value
HI_GAMMA_COMPL                      3277         High gamma value complement
LO_BETA                             31130        Low beta value
HI_BETA                             32702        High beta value
INIT_FRAMES                         10           Initial number of frames (considered to be noise frames)
SINE_START_CHAN                     4            Sine start channel (for sine wave detection)
PEAK_TO_AVE_THLD                    10           Peak to average threshold
DEV_THLD                            1523942      Deviation threshold
HYSTER_CNT_THLD                     9            Hysteresis count threshold
F_UPDATE_CNT_THLD                   500          Forced update count threshold
NON_SPEECH_THLD                     32           Non-speech threshold
FIX_34                              24576        (short) (32768.0 * 3.0/4.0)
FIX_18                              4096         (short) (32768.0 * 1.0/8.0)
FIX_INVSQRT2                        -23170       1 / sqrt(2)
swTHIRD_REF_BANDWIDTH               85           One third of the reference bandwidth
swTWO_THIRDS_REF_BANDWIDTH          171          Two thirds of the reference bandwidth
MIN_ENERGY_MANTISSA                 25600        Minimum energy mantissa
MIN_ENERGY_SHIFT                    18           Minimum energy shift
swREF_SAMPLE_RATE_Q0                0x1F40       Reference sampling rate in Q0 format
swCLOSE_FACTOR_Q14                  0x4CCD       Closeness factor in Q14 format
swFD_SCORE_THLD1_Q15                0x63D7       Frequency domain score threshold 1 in Q15 format
swFD_SCORE_THLD2_Q15                0x570A       Frequency domain score threshold 2 in Q15 format
swCORR_THLD_Q15                     0x651F       Correlation threshold in Q15 format
swSUM_THLD_Q14                      0x6667       Sum threshold in Q14 format
lwCRITO_OFFSET_Q15                  0x0000170A   Offset for finding a better pitch candidate in Q15 format
swCANDCORR_THLD1_Q15                0x799A       Pitch candidate correlation threshold 1 in Q15 format
swCANDCORR_THLD2_Q15                0x599A       Pitch candidate correlation threshold 2 in Q15 format
swCANDCORR_THLD3_Q15                0x5CCD       Pitch candidate correlation threshold 3 in Q15 format
swCANDAMP_THLD3_Q15                 0x68F6       Pitch candidate amplitude threshold 3 in Q15 format
swSTARTFREQ_COEFF                   0x553F       Start frequency coefficient (for candidate search)
swENDFREQ_COEFF                     0x4666       End frequency coefficient (for candidate search)
DIRICHLET_KERNEL_SPAN               8            Dirichlet kernel span (for interpolation)
REF_SAMPLE_RATE                     8000         Reference sampling rate
REF_BANDWIDTH                       4000         Reference bandwidth
lwTHIRD_REF_BANDWIDTH               87381333     One third of the reference bandwidth
lwTWO_THIRDS_REF_BANDWIDTH          174762667    Two thirds of the reference bandwidth
swCENTER_WEIGHT                     0x5000       Center weight
swSIDE_WEIGHT                       0x1800       Side weight
swAMP_SCALE_DOWN1                   0x5333       Amplitude scale down factor 1
swAMP_SCALE_DOWN2                   0x399A       Amplitude scale down factor 2
swAMP_SCALE_DOWN2b                  0x7333       Amplitude scale down factor 2b
swUDIST1                            -4160        Utility function distance 1
swUDIST2                            -6400        Utility function distance 2
swUSTEP                             -16384       Utility function step
swFREQ_MARGIN1                      0x4AE1       Frequency margin 1
swAMP_MARGIN1                       0x07AE       Amplitude margin 1
swAMP_MARGIN2                       0x07AE       Amplitude margin 2
MIN_STABLE_FRAMES                   6            Minimum number of stable frames
MAX_TRACK_GAP_FRAMES                2            Maximum pitch track gap frames
swSTABLE_FREQ_UPPER_MARGIN          0x4E14       Stable frequency upper margin
swSTABLE_FREQ_LOWER_MARGIN          0x68EB       Stable frequency lower margin
UNVOICED                                         Pitch frequency of an unvoiced frame
lwMAX_PITCH_FREQ                    0x01A40000L  Maximum pitch frequency
lwMIN_PITCH_FREQ                    0x00340000L  Minimum pitch frequency
MAX_PITCH_FREQ                      420          Maximum pitch frequency in Hz
MIN_PITCH_FREQ                      52           Minimum pitch frequency in Hz
HIGHPASS_CUTOFF_FREQ                300          Highpass cut-off frequency in Hz
NO_OF_FRACS                         77           Number of fractions in the fractions table
lwSHORT_WIN_START_FREQ              0x00C80000L  Short window start frequency
lwSHORT_WIN_END_FREQ                0x01A40000   Short window end frequency
lwSINGLE_WIN_START_FREQ             0x00640000L  Single window start frequency
lwSINGLE_WIN_END_FREQ               0x00D20000L  Single window end frequency
lwDOUBLE_WIN_START_FREQ             0x00340000   Double window start frequency
lwDOUBLE_WIN_END_FREQ               0x00780000L  Double window end frequency
MAX_LOCAL_MAXIMA_ON_SPECTRUM        70           Maximum number of local maxima on the spectrum
MAX_PEAKS_FOR_SORT                  30           Maximum number of peaks for sorting
MAX_PEAKS_PRELIM                    7            Maximum number of peaks (preliminary)
MIN_PEAKS                           7            Minimum number of peaks
MAX_PEAKS_FINAL                     20           Maximum number of peaks (final)
MAX_PRELIM_CANDS                    4            Maximum number of preliminary candidates (pitch)
CREATE_PIECEWISE_FUNC_LOOP_LIM_SH   20           Create Piecewise function loop limit for short window
CREATE_PIECEWISE_FUNC_LOOP_LIM_SNG  30           Create Piecewise function loop limit for single window
CREATE_PIECEWISE_FUNC_LOOP_LIM_DBL  60           Create Piecewise function loop limit for double window
swSUM_FRACTION                      0x799A       Sum fraction
swAMP_FRACTION                      0x33F8       Amplitude fraction
MAX_BEST_CANDS                      2            Maximum number of best candidates (pitch)
N_OF_BEST_CANDS_SHORT               2            Number of best candidates for short window
N_OF_BEST_CANDS_SINGLE              2            Number of best candidates for single window
N_OF_BEST_CANDS_DOUBLE              2            Number of best candidates for double window
N_OF_BEST_CANDS                     6            Number of best candidates for all windows
SIZE_SCRATCH_DOPITCH                1090         Scratch memory size for DoPitch() function (This is the actual size required. The declared size in C simulation is 1632)
SIZE_SCRATCH_ADVPROCESS             825          Scratch memory size for DoAdvProcess() function (This is the actual size required. The declared size in C simulation is 1100)
RVC_PITCH_ROM_SIG                   11031        Signature for RVC_PITCH_ROM structure
RVC_PITCH_METER_SIG                 21053        Signature for RVC_PITCH_METER structure
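
Several constants in Table 5c come in pairs of a factor and its complement that sum to 32768, for example the channel energy smoothing factor (18022) and its complement (14746). This is consistent with fractions scaled by 2^15. The sketch below shows how such a pair could drive a first-order smoothing step in integer arithmetic. It is an illustration under that assumption, not the routine used in the attached code, and the helper name smooth_q15 is hypothetical.

```c
#include <stdint.h>

#define CE_SM_FAC       18022 /* channel energy smoothing factor (Table 5c) */
#define CE_SM_FAC_COMPL 14746 /* its complement: 32768 - 18022 */

/* Hypothetical helper: one first-order smoothing step with Q15 weights,
 * y = fac * prev + (1 - fac) * cur, using a 64-bit accumulator.
 * Non-negative inputs are assumed, so the final shift is well defined. */
static int32_t smooth_q15(int32_t prev, int32_t cur)
{
    int64_t acc = (int64_t)CE_SM_FAC * prev + (int64_t)CE_SM_FAC_COMPL * cur;
    return (int32_t)(acc >> 15);
}
```

Storing the complement as a separate constant avoids computing 32768 - factor at run time, where the value 32768 itself does not fit in a 16-bit signed variable.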
4.5.2 Description of fixed tables used in the C-code

This section contains a listing of all fixed tables sorted by source file name and table name. All table data is declared as
Word16.

Table 6a: Fixed tables for AFE

File                 Table Name            Length  Description
16kHzProcessing_B.c  table_pow2            33      Table for square root
                     LambdaNSEx2           100     Table used to compute first 100 LambdaNSE
                     dp02_h                59      MSB of QMF filter coefficients
                     dp02_l                43      LSB of QMF filter coefficients
PostProc_B.c         targetLMS16           12      Target for blind equalization
ComCeps_B.c          HalfHamming16         100     Hamming window coefficients
                     CosMatrix16           144     Inverse cosine coefficients at 8 kHz (not used at 16 kHz)
                     CosMatrix16_16khz     156     Inverse cosine coefficients at 16 kHz
                     pond_MelFilter        309     Mel bank coefficients
ff4nrFix16_B.c       tabSin                64      Sine table
                     tabCos                64      Cosine table
MathFunc.c           tbIntO                48      Coefficients for computation of square root
ExtNoiseSup_B.c      lambda_1divX          20      Computation of 1/N
                     Hann_sh32_hi          100     MSB of Hanning window coefficients (32 bits)
                     Hann_sh32_lo          100     LSB of Hanning window coefficients (32 bits)
                     Hann_sh24_hi          100     MSB of Hanning window coefficients (24 bits)
                     Hann_sh24_lo          100     LSB of Hanning window coefficients (24 bits)
                     pond_MelFilterNoise   157     Mel-frequency scale coefficients (applied to the Wiener filter)
                     idctMel16             234     Mel-warped inverse DCT coefficients
                     pondMelFilter16k      134     Filter bank coefficients at 16 kHz
                     Ml_LamdaLTE           8       Computation of 1/N
                     Ml_LambdaNSEx2        100     Computation of 2/N
                     Ml_LamdaNSE           9       Computation of 1/N
                     mInvLambda16          10      Computation of 2/N



Table 6b: Fixed tables for VQ

File         Table Name                Length  Description
coder_VAD.c  quantizer16kHz_1          128     vq table
             quantizer16kHz_2_3        128     vq table
             quantizer16kHz_4_5        128     vq table
             quantizer16kHz_6_7        128     vq table
             quantizer16kHz_8_9        128     vq table
             quantizer16kHz_10_11      64      vq table
             quantizer16kHz_12_13      512     vq table
             quantizer8kHz_1           128     vq table
             quantizer8kHz_2_3         128     vq table
             quantizer8kHz_4_5         128     vq table
             quantizer8kHz_6_7         128     vq table
             quantizer8kHz_8_9         128     vq table
             quantizer8kHz_10_11       64      vq table
             quantizer8kHz_12_13       512     vq table
             weight16kHz_c0_shift              vq weights
             weight16kHz_c0_norm               vq weights
             weight16kHz_logE                  vq weights
             weight8kHz_c0_shift               vq weights
             weight8kHz_c0_norm                vq weights
             weight8kHz_logE                   vq weights
             plwQuantLevels[127]       127*2   vq tables for pitch/class quantization
             ppplwQuantSections[8][3]  24*2    vq tables for pitch/class quantization
             plwQuantLevels[31]        31*2    vq tables for pitch/class quantization
             pplwQuantSections[4][3]   12*2    vq tables for pitch/class quantization
             pswRatioThld_1[4][6]      24      vq tables for pitch/class quantization
             piMultiLevelIndex[4]      4       vq tables for pitch/class quantization
             pswRatioThld_2[4][8]      32      vq tables for pitch/class quantization
             piMultiLevelIndex_2[4]    4       vq tables for pitch/class quantization
             swAlpha1                  1       pitch/class constants
             swAlpha2                  1       pitch/class constants
Table 6c: Fixed tables for Extension

File                Table Name               Length  Description
ExtNoiseSup_B.c     pswPePower               129     Coefficients to compute the pre-emphasis power spectrum
preProc_B.c         pswHpfCoef               15      High pass filter coefficients
preProc_B.c         pswLpfCoef               15      Low pass filter coefficients
preProc_B.c         pswLfeCoef               3       Low frequency emphasis filter coefficients
dsrAfeVad_B.c       piBurstConst             20      Burst length constants for different SNRs
dsrAfeVad_B.c       piHangConst              20      Hang length constants for different SNRs
dsrAfeVad_B.c       piVADThld                20      VAD voice metric thresholds for different SNRs
dsrAfeVad_B.c       piVMTable                90      Voice metric table as a function of SNR index
dsrAfeVad_B.c       piSigThld                20      Signal threshold table as a function of SNR
dsrAfeVad_B.c       piUpdateThld             20      Update threshold table as a function of SNR
dsrAfeVad_B.c       pswShapeTable            23      Spectral shape correction table
fix_mathlib.c       coeff_sqrt5_58           5       Coefficients for computation of square root
fix_mathlib.c       coeff_sqrt5_78           5       Coefficients for computation of square root
rvc_pitch_init_B.h  ROM_astFrac              312     Fractions table
rvc_pitch_init_B.h  ROM_pstWindowshiftTable  514     Complex exponents table for time shifting in frequency domain
rvc_pitch_init_B.h  ROM_aswDirichletImag     8       Imaginary part of the Dirichlet kernel



4.5.3 Static variables used in the C-code

The tables in this section specify the static variables for the AFE, VQ, and Extension, respectively.
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Table 7a: AFE static variables 



Struct Name 


Variable 


Type[Length] 


Description 


QMF FIR 










lengthQMF 


Word32 


QMF Filter length 




*dp 1 


Word16 


OMF filter low frequency Coeff 




*dp h 


Word16 


OMF filter high frequency Coeff 




*T 


Word16 


Temporary OMF filter buffer 




T dec 


Wordie 


Multiplier for T 


DataFor16kProc B 










FrameLength 


Word32 


Input Frame length 




FrameShift 


Word32 


Shift value for the frame 




numFrameslnBuffer 


Word32 


Number of frames in buffer 




Sampiing Frequency 


Word32 


Sampling frequency (8/1 6) 




Do16kHzProc 


BOOLEAN 


Flag to enable 16kHz processing 




"hpBands B 


Word32 


Buffer for HP bands 




hpBandsSize 


Word32 


hpBands B buffer size 




CodeForBandslSk B 


Word32[9] 


HP coding buffer 




bufferCodeForBands16k B 


Word32[27] 


buffer used for HP coding 




codeWeights B 


Word16[3] 


code Weights buffer 




bufferCodeWeights B 


Word16[9] 


buffer used for code Weights 




* pQMF Fir 


OIVIF FIR 


Pointer to QMF FIR structure 




*bufferData16k B 


Word32 


temporary buffer to carry OMF LP data 




bufData16kSize 


Word32 


16k data buffer size 




•FirstWindowl 6k 


MelFB Window 


pointer to MelFB Window structure 




noiseSE16k B 


Word32[31 


noise spectrul energy variable 




noise dec 


Word16 


Multiplier for noiseSEI 6k B 




BandsForCodinglSk B 


Word32[9] 


buffer for storing Bands for Coding 




vadCounter16k 


Word32 


vad flag counter 




vad16k 


Word32 


vad flag 




nbSpeechFramesI 6k 


Word32 


number of speech frames counter 




hangOverl 6k 


Word32 


hang over used for VAD 




meanEnlBk 


Word32 


mean Energy variable 




nb frame threshold nse 


Word32 


threshold NSE for frame 




lambda nse 


Word16 


lambda NSE variable 




"dataHP B 


Word32 


buffer stores OMF HP value 




dec 16k 


Word16[5] 


Multiplier for dataHP B buffer 




BFC dec 


Word16[1] 


Multiplier for computing bands for coding 




fb16k dec 


Word16[3] 


Buffer is used to store multiplier for current and pervious two frames 


PostProcStructX 










weightLMS 


Word32[12] 


Current LMS weight 


CompCepsStructX 










FFTLength 


Word32 


FFT size 




Do16khzProc 


Word16 


Flag to enable 16kHz processing 




*pData16k 


Word32 


Pointer to data for leKhz processing 


WaveProcStructX 










*TeagerFilter16 


Word32 


Pointer to teager filter 




*TeagerWindow32 


Word32 


Pointer to teager window 




TeagerOnset 


Word32 


Unused 




FrameLength 


Word32 


Input frame length 


ns_var_F










SampFreq 


Word16


Sampling frequency (8/16)




Do16khzProc


Word16


Flag to enable 16kHz processing 




buffers.nbFramesInFirstStage


Word32


number of frames in first stage




buffers.nbFramesInSecondStage


Word32


number of frames in second stage




buffers.nbFramesOutSecondStage


Word32


number of frames out of second stage




buffers.FirstStageIn16Buffer


Word16[180]


First stage buffer




buffers.SecondStageInBuffer32


Word32[180]


Second stage buffer




buffers.SecondDecalSig


Word16[4]


Shift factor for each sub-frame of second stage buffer




prevSamples32.lastSampleIn32


Word32 


Last input sample of DC offset compensation 




prevSamples32.lastDCOut32 


Word32 


last output sample of DC offset compensation 




prevSamples32.oldShift


Word16


Previous window shift factor of DC offset compensation




spectrum.indexBuffer1


Word16


Where to enter new PSD for first stage, alternatively 0 and 1




spectrum.indexBuffer2


Word16


Where to enter new PSD for second stage, alternatively 0 and 1




spectrum.noiseSE1_32


Word32[65]


Noise spectrum estimate for first stage




spectrum.noiseSE1_dec


Word16[65]


Shift factor for noise spectrum estimate (first stage)




spectrum.noiseSE2_32


Word32[65]


Noise spectrum estimate for second stage




spectrum.noiseSE2_dec


Word16[65]


Shift factor for noise spectrum estimate (second stage)




spectrum.PSDMeanAntBuffer1


Word32[65]


1st stage PSD mean buffer for preceding frame




spectrum.nSigSE1Ant_dec


Word16[65]


Shift factor for PSD mean buffer for preceding frame (1st stage)




spectrum.PSDMeanAntBuffer2


Word32[65]


2nd stage PSD mean buffer for preceding frame




spectrum.nSigSE2Ant_dec


Word16[65]


Shift factor for PSD mean buffer for preceding frame (2nd stage)




spectrum.denSigSE1_32


Word32[65]


1st stage PSD mean buffer




spectrum.nSigSE1Cur_dec


Word16[65]


Shift factor for PSD mean buffer (1st stage)




spectrum.denSigSE2_32


Word32[65]


2nd stage PSD mean buffer




spectrum.nSigSE2Cur_dec


Word16[65]


Shift factor for PSD mean buffer (2nd stage)




vad_data_ns_F.nbFrame


Word16[2]


Number of frames (for the 2 stages)




vad_data_ns_F.flagVAD


Word16


VAD flag (1 = SPEECH, 0 = NON SPEECH)




vad_data_ns_F.hangOver


Word16


hangover




vad_data_ns_F.nbSpeechFrames


Word16


Number of speech frames (used to set hangover)




vad_data_ns_F.meanEn32


Word32 


Mean energy for VAD 




vad_data_ca.flagVAD


Word16


VAD flag (1 = SPEECH, 0 = NON SPEECH)




vad_data_ca.hangOver


Word16


hangover




vad_data_ca.nbSpeechFrames


Word16


Number of speech frames (used to set hangover)




vad_data_ca.meanEn32


Word32 


Mean energy for VAD 




vad_data_fd.MelMean


Word16


SpeechQMel (for frame dropping)




vad_data_fd.VarMean


Word32


SpeechQVar (for frame dropping)




vad_data_fd.AccTest


Word32 


SpeechQSpec (for frame dropping) 




vad_data_fd.AccTest2


Word32 






vad_data_fd.SpecMean


Word32 


SpecMean (for frame dropping) 




vad_data_fd.MelValues


Word16[2] 


SpeechQMel (for frame dropping) 




vad_data_fd.SpecValues


Word32 


SpeechQSpec (for frame dropping) 




vad_data_fd.SpeechInVADQ


Word16


Flag (for frame dropping)




vad_data_fd.SpeechInVADQ2


Word16


Flag (for frame dropping) 




gainFact.logDenEn1_32


Word32[3] 


Denoise frame energy for gain factorization 




gainFact.lowSNRtrack32 


Word32 


Low SNR level for gain factorization 




gainFact.alfaGF16


Word16


Wiener filter gain factorization coefficient


VADStructX_F










Focus 


Word16


Position in circular buffer




HangOver 


Word16 


Hangover length 




FlushFocus 


Word16


Position in circular buffer when emptying at end 




H_CountDown


Word16 


Main hangover countdown 




V_CountDown


Word16 


Short hangover countdown 




**OutBuffer 


Word32 


OutBuffer pointer pointer 




*OutBuffer


Word32[7]


OutBuffer pointer 




OutBuffer 


Word16[7x15] 


OutBuffer 



Table 7b: VQ static variables 



Struct Name 


Variable 


Type [Length] 


Description 


coder_VAD.c


four_frames[27]


Word16[27]


Previous frames used to build multiframe




plwQPHistory[3] 


Word32[3] 


History of Pitch 




iReliableFlag


Word16


Pitch reliability flag



Table 7c: Extension static variables



Struct Name 


Variable 


Type[Length] 


Description 




iFirstFrameFlag


Word16


First frame flag




pswUBSpeech 


Word16[200]


Upper band speech




pswDownSampledProcSpeech 


Word16[75] 


Down-sampled processed speech




lwCritMax


Word32 


Maximum power ratio 




iOldPitchPeriod 


Word16


Old pitch period value




iOldFrameNo


Word16


Old frame number 


PCORR STATE be 


s be 








lwX1_X1


Word32


X1*X1




lwZ1_Z1


Word32


Z1*Z1




lwZ2_Z2


Word32


Z2*Z2




lwX1_Z1


Word32


X1*Z1




lwX1_Z2


Word32


X1*Z2




lwZ1_Z2


Word32


Z1*Z2




swX1_Sum


Word16


Sum of X1




swZ1_Sum


Word16


Sum of Z1




swZ2_Sum


Word16


Sum of Z2 




iBurstConst 


Word16


Burst constant




iBurstCount


Word16


Burst count




iHangConst


Word16


Hang constant




iHangCount


Word16


Hang count 




iVADThld


Word16


VAD threshold




iFrameCount


Word16


Frame count




iFUpdateFlag


Word16


Forced update flag 




iHysterCount 


Word16


Hysteresis count




iLastUpdateCount


Word16


Last update count 




iSigThld


Word16


Signal threshold




iUpdateCount


Word16


Update count




iChanEnrgShift


Word16


Channel energy shift




iChanNoiseEnrgShift


Word16


Channel noise energy shift




pswChanEnrg 


Word16[23]


Channel energy




pswChanNoiseEnrg


Word16[23]


Channel noise energy




swBeta


Word16


Beta value




swSnr


Word16


SNR value 


NormSw 


pnsLogSpecEnrgLong 








swMantissa 


Word16[23]


Mantissa




iShift


Word16[23]


Shift 




swC0


Word16


C0 value




swC1


Word16


C1 value




swC2


Word16


C2 value 




pswHpfXState 


Word16[6]


High pass filter input state




pswHpfYState


Word16[12]


High pass filter output state




pswLpfXState


Word16[6]


Low pass filter input state




pswLpfYState


Word16[12]


Low pass filter output state




pswLfeXState


Word16


Low frequency emphasis filter input state




pswLfeYState


Word16[2]


Low frequency emphasis filter output state



5 File formats

This section describes the file formats used by the AFE, VQ and Extension programs.



5.1 Speech file 



Speech files read by the X-AFE and written by the Extension consist of 16-bit words. The byte order depends on the
host architecture (e.g. MSByte first on SUN workstations, LSByte first on PCs, etc.).
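
Because the byte order of a speech file follows the host that wrote it, a file moved between architectures has to be decoded with its byte order stated explicitly. The following is a minimal sketch of such a reader, not part of the reference code: the names decode_sample and read_speech and the lsb_first parameter are hypothetical, and the caller is assumed to know which byte order the file was written with.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: build one signed 16-bit sample from two file bytes.
 * lsb_first != 0 means the file stores the LSByte first (e.g. written on a PC),
 * lsb_first == 0 means the MSByte comes first (e.g. written on a SUN workstation). */
static int16_t decode_sample(const unsigned char b[2], int lsb_first)
{
    unsigned int u = lsb_first ? (((unsigned int)b[1] << 8) | b[0])
                               : (((unsigned int)b[0] << 8) | b[1]);

    /* Map the 16-bit two's-complement pattern to a signed value explicitly,
     * so the result does not depend on implementation-defined conversions. */
    return (u & 0x8000u) ? (int16_t)((long)u - 65536L) : (int16_t)u;
}

/* Hypothetical helper: read up to max_samples 16-bit words from fp.
 * Returns the number of complete samples read; a trailing odd byte is ignored. */
static size_t read_speech(FILE *fp, int16_t *out, size_t max_samples, int lsb_first)
{
    unsigned char pair[2];
    size_t n = 0;

    while (n < max_samples && fread(pair, 1, 2, fp) == 2)
        out[n++] = decode_sample(pair, lsb_first);

    return n;
}
```

Decoding byte by byte keeps the reader independent of the byte order of the machine it runs on; only the byte order of the file has to be supplied.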
Annex A (informative):
Change history 



Change history 


Date 


TSG# 


TSG Doc. 


CR 


Rev 


Subject/Comment 


Old 


New 


2004-06 


24 


SP-040343 






Version 6.0.0 approved at 3GPP TSG SA#24 


2.0.0 


6.0.0 


2004-12 


26 


SP-040837 


001 


1 


Software bug correction: Removal of Basicops 
simulation of "0" shift operator 


6.0.0 


6.1.0 


2004-12 


26 


SP-040837 


002 


1 


Software bug correction: Initialization of the variables 
Iwc and i2aScale 


6.0.0 


6.1.0 


2004-12 


26 


SP-040837 


003 


1 


Software bug correction: Wrong assignment of the 
variables *piReliableFlag and *pcQPIndex 


6.0.0 


6.1.0 


2004-12 


26 


SP-040837 


004 


2 


Software bug correction: Use of incorrect variable 
fRef Period instead of IRef Period 


6.0.0 


6.1.0 


2004-12 


26 


SP-040837 


005 




Add reference to test sequences document 


6.0.0 


6.1.0 


2007-06 


26 








Version for Release 7 


6.1.0 


7.0.0 


2008-12 


42 








Version for Release 8 


7.0.0 


8.0.0 


2009-12 


46 








Version for Release 9 


8.0.0 


9.0.0 


2011-03 


51 








Version for Release 10 


9.0.0 


10.0.0 


2012-09 


57 








Version for Release 11


10.0.0


11.0.0